@knfreemLD knfreemLD commented Jan 21, 2026

Requirements

  • I have added test coverage for new or changed functionality
  • I have followed the repository's pull request submission guidelines
  • I have validated my changes against all supported platform versions

Related issues

https://launchdarkly.atlassian.net/browse/REL-11511
See tech spec at https://docs.google.com/document/d/1lzYwQqCcTzN_2zkxJZDfJtgUcEJ4jbpx0KSsJ2bRENw/edit?tab=t.0#heading=h.69bdm7karsxh

Describe the solution you've provided

Updates the SDK to read the AI Config's new evaluationMetricKey property. Also adds tests that were missing from the previous implementation, and keeps a fallback to the original evaluationMetricKeys list.


Note

Implements single-key judge evaluation with backward compatibility and comprehensive tests.

  • Switches judge configs to use evaluationMetricKey (deprecating evaluationMetricKeys), updating AIJudgeConfig(Default) serialization
  • LDAIClient.__evaluate now returns the raw variation; judge_config extracts evaluationMetricKey with fallback to first in evaluationMetricKeys
  • Judge updated to validate and parse a single metric; EvaluationSchemaBuilder builds a single-key structured schema; minor cleanup of unused imports/comments
  • Adds extensive unit tests for judge behavior, schema building, and client extraction (including consistency of single variation, sampling, error paths)
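
The fallback described above can be sketched roughly as follows. This is a minimal illustration, not the SDK's actual code: the helper name `extract_evaluation_metric_key` and the assumption that the raw variation is a plain dict are hypothetical; only the `evaluationMetricKey` and `evaluationMetricKeys` property names come from this PR.

```python
from typing import Any, Dict, Optional


def extract_evaluation_metric_key(variation: Dict[str, Any]) -> Optional[str]:
    """Hypothetical helper: pick the judge's metric key from a raw variation dict.

    Prefers the new single `evaluationMetricKey`; otherwise falls back to the
    first entry of the deprecated `evaluationMetricKeys` list.
    """
    # Preferred path: the new single-key property.
    key = variation.get("evaluationMetricKey")
    if isinstance(key, str) and key.strip():
        return key

    # Legacy path: use the first entry of the deprecated list, if usable.
    legacy = variation.get("evaluationMetricKeys")
    if isinstance(legacy, list) and legacy:
        first = legacy[0]
        if isinstance(first, str) and first.strip():
            return first

    # Neither property yields a usable key.
    return None
```

With this shape, a variation that carries both properties resolves to the single key, so older flag payloads keep working while newer ones take precedence.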

Written by Cursor Bugbot for commit c6d086a. This will update automatically on new commits.

@knfreemLD knfreemLD changed the title from "[REL-11511] Add support for custom judges via evaluation metric key" to "feat: add support for custom judges via evaluation metric key" Jan 21, 2026
@knfreemLD knfreemLD requested a review from jsonbailey January 21, 2026 15:21
@knfreemLD knfreemLD marked this pull request as ready for review January 21, 2026 15:43
@knfreemLD knfreemLD requested a review from a team as a code owner January 21, 2026 15:43
Default Judge-specific AI Config with required evaluation metric key.
"""
messages: Optional[List[LDMessage]] = None
# Deprecated: evaluation_metric_key is used instead
Contributor

Since we are pre-1.0, as long as we can guarantee the API always returns the new single key, we should be able to just drop this and make a breaking change. The only thing that really makes this breaking is that people will need to update their defaults if they defined it. If you want to drop it now, update the PR title to "feat!: ".

I won't block if you want to leave this in for a little while, but it likely isn't necessary. The real question is how long we want to continue sending the old values in the API, as that is what will break older SDKs.

Contributor Author

For now we want to make sure this is non-breaking, but soon we're going to remove "legacy" support. To keep this change as minimal and safe as possible, I'd err on the side of caution and keep it in for the time being.

@knfreemLD knfreemLD requested a review from jsonbailey January 21, 2026 17:45